4/25/2024

Introduction and Roadmap

  • Geospatial data ethics
  • Overview of basic remote sensing concepts
  • Google Earth Engine and geospatial “big data”
  • Tutorial: Working with satellite imagery products in R and GEE

Geospatial Data Ethics: Visualization and Analysis

  • It is easy to manipulate maps (even unintentionally) to support a given narrative or hypothesis (i.e. “lying with maps”).
    • Choices about classification schemes, data standardization methods, map projections, aggregation units, etc. can dramatically influence the conclusions an audience will draw from a map (Deluca and Nelson).
    • It is important to understand how maps function as rhetorical devices, and how these choices affect interpretation, so that we can make them intentionally and minimize the risk of misleading people.
  • Maps can also convey sensitive information, which creates disclosure risks.
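
To make the point about classification schemes concrete, here is a minimal Python sketch (not part of the original slides) that bins the same values using two common schemes, equal-interval breaks and quantile breaks. The data values and function names are hypothetical and chosen only for illustration.

```python
import math

def equal_interval_breaks(values, k):
    """Upper class bounds that split the data range into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]

def quantile_breaks(values, k):
    """Upper class bounds that place roughly equal numbers of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(n * i / k) - 1] for i in range(1, k + 1)]

def classify(value, breaks):
    """Return the index of the first class whose upper bound is >= value."""
    for class_index, upper in enumerate(breaks):
        if value <= upper:
            return class_index
    return len(breaks) - 1

# Hypothetical rates for nine areal units, with one extreme outlier.
rates = [1, 2, 2, 3, 3, 4, 5, 6, 40]

equal_classes = [classify(v, equal_interval_breaks(rates, 3)) for v in rates]
quantile_classes = [classify(v, quantile_breaks(rates, 3)) for v in rates]

print("equal interval:", equal_classes)
print("quantile:      ", quantile_classes)
```

With equal intervals, the single outlier stretches the range so that every other unit falls in the lowest class. With quantiles, the same units are spread evenly across three classes, so the two maps would look very different even though the underlying data are identical.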

Geospatial Data Ethics: Privacy and Disclosure Risk

  • “Whether through the publication of data (including maps) or the direct transfer of data files, the dissemination of data to other researchers poses confidentiality protection problems for spatially explicit social survey or census data” (VanWey, Rindfuss, Gutmann, Entwisle, Balk 2005, 15338).
    • When working with geolocated survey datasets or data collected using mobile technology (for example), it can be challenging to balance the need to protect individual privacy with the need for granular data for analysis (and the need to share this data with others for the sake of scholarly transparency).
    • Some techniques, such as “geomasking” (deliberately altering or coarsening point locations before release), can help researchers work with and disseminate this data in an ethical manner.
    • If you need to share geolocated data that can potentially identify individuals, but don’t want to publicly disseminate it, certain repositories (like ICPSR) can help facilitate this.
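
As a rough illustration of geomasking, the sketch below implements one simple variant, often called "donut" masking: each point is moved in a random direction by a random distance between a minimum and a maximum radius. This is an illustrative example rather than the method used in any particular study. The function name, the distances, and the coordinates are hypothetical, and the coordinates are assumed to be in a projected system measured in meters.

```python
import math
import random

def donut_geomask(x, y, min_dist, max_dist, rng=random):
    """Displace (x, y) by a random distance in [min_dist, max_dist] in a random direction.

    Assumes projected coordinates in the same units as the distances (e.g. meters).
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sample the squared radius uniformly so displaced points are spread evenly
    # over the ring's area instead of clustering near the inner edge.
    radius = math.sqrt(rng.uniform(min_dist ** 2, max_dist ** 2))
    return x + radius * math.cos(angle), y + radius * math.sin(angle)

# Hypothetical respondent locations (projected coordinates, in meters).
true_points = [(500000.0, 4500000.0), (500120.0, 4500340.0)]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
masked_points = [donut_geomask(x, y, 50.0, 250.0, rng) for x, y in true_points]

for (x, y), (mx, my) in zip(true_points, masked_points):
    print(f"moved by {math.hypot(mx - x, my - y):.1f} m")
```

The minimum distance guarantees that no released point sits on its true location, while the maximum distance limits how much spatial accuracy is lost. Choosing those two radii is the privacy versus granularity trade-off described above.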

Remote Sensing: Overview

  • Remote sensing techniques allow us to learn about the activities and processes taking place on the earth’s surface by extracting information from the radiation reflected or emitted by objects on that surface.
  • A pixel in a satellite image contains information about the reflectance (at different parts of the electromagnetic spectrum) of surfaces or objects in the geographic area covered by that pixel; imagery datasets are therefore a type of raster.
  • “Remote sensing” refers to the process of collecting, extracting, visualizing, and analyzing this pixel-level reflectance data
  • Remote sensing techniques are used to create many of the off-the-shelf raster datasets that are widely used in applied work. For example, the population count rasters we worked with last class (clipped to NYC) are created by feeding remotely sensed data (which provide information on things such as land cover and night lights) into complex models that estimate grid-cell population counts (see Stevens, Gaughan, Linard, and Tatem, 2015).

Remote Sensing: Collecting Data

  1. Sunlight strikes the earth’s surface
  2. Some of it is absorbed, and some is reflected back into space, where it is recorded by a sensor on a satellite (the sensor could also be mounted on a drone, weather balloon, etc.)

Remote Sensing: Collecting Data, continued

  • Due to their different chemical compositions, different objects or surfaces reflect sunlight in different amounts at different parts of the electromagnetic spectrum.
  • In other words, various objects have different “spectral signatures”, and these different spectral signatures are recorded by the sensor positioned on the satellite. We can use these differing spectral signatures to make inferences about what sort of object is within a given pixel in an image.

Remote Sensing: Visualizing spectral signatures

  • This diagram is a visual representation of the differing spectral signatures of common objects found on the surface of the earth.
    • For example, at a wavelength of 800nm (within the near-infrared region of the electromagnetic spectrum), snow and ice have high reflectance signatures, while clear water has a very low reflectance signature. We can use these differing spectral signatures to make inferences about what sort of object is within a given pixel in a satellite image.
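
One simple way to turn this idea into a procedure is to compare each pixel's reflectance values with a set of reference signatures and assign the pixel to the closest one. The sketch below does this with a nearest-match rule based on Euclidean distance. The reference reflectances are hypothetical numbers chosen only to mimic the qualitative pattern described above (snow bright, clear water dark, vegetation bright in the near-infrared but dark in the red). They are not measured values, and the function name is an assumption made for illustration.

```python
import math

# Hypothetical (red, near-infrared) reflectances on a 0-1 scale.
# Illustrative only: these are NOT measured spectral signatures.
REFERENCE_SIGNATURES = {
    "snow/ice": (0.90, 0.85),
    "clear water": (0.05, 0.02),
    "vegetation": (0.05, 0.50),
}

def nearest_signature(pixel, signatures=REFERENCE_SIGNATURES):
    """Return the label whose reference signature is closest to the pixel."""
    return min(signatures, key=lambda name: math.dist(pixel, signatures[name]))

# Hypothetical pixels, each given as (red, near-infrared) reflectance.
pixels = [(0.08, 0.45), (0.04, 0.03), (0.88, 0.80)]
labels = [nearest_signature(p) for p in pixels]
print(labels)
```

Real classification workflows use more bands and more careful statistical models, but the underlying logic is the same: pixels are labeled according to which known signature they most resemble.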

Using spectral signatures to classify images/pixels: Identifying live vegetation

  • Let’s say you want to identify pixels in a satellite image with a high degree of live vegetation.
  • Different bands in a satellite image raster dataset correspond to different regions of the electromagnetic spectrum (these bands correspond to Landsat 7)
  • How could you use raster calculations to generate a new raster that will allow us to infer the amount of live vegetation in a given grid cell? Use the diagram below to inform your answer

The Normalized Difference Vegetation Index (NDVI)

  • The intuition for the Normalized Difference Vegetation Index (NDVI) is basically what many of you would have inferred based on an examination of the spectral signature for natural vegetation.
  • Live vegetation (due to its cellular structure) reflects large amounts of energy in the near-infrared region of the spectrum (Band 4, in Landsat 7), but reflects much less energy in the red region of the spectrum (Band 3, in Landsat 7).
  • The basic idea, then, is that we can subtract the reflectance values in the red band (Band 3) from the reflectance values in the near-infrared band (Band 4); where this difference is large in a given pixel, we can infer that it contains a large amount of live green vegetation (since a large difference is implied by green vegetation’s spectral properties).
  • Dividing this difference by the sum of the reflectances in these bands yields a normalized vegetation index (i.e. the NDVI) whose cell values range from -1 to 1: values near 1 indicate the highest possible density of green vegetation, while values near zero or below indicate little or no vegetation.

The Normalized Difference Vegetation Index (NDVI), continued

In equation form,

\(\text{NDVI} = \frac{\text{Near Infrared Band} - \text{Red Band}}{\text{Near Infrared Band} + \text{Red Band}}\)

If you were using a Landsat 7 image to derive the NDVI, this would mean

\(\text{NDVI} = \frac{\text{Band 4} - \text{Band 3}}{\text{Band 4} + \text{Band 3}}\)
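
The formula can be applied cell by cell to two bands of the same raster. The following is a minimal Python sketch under some assumptions: the two small grids stand in for Landsat 7 Band 4 (near-infrared) and Band 3 (red), their reflectance values are hypothetical, and the function names are made up for illustration. Returning `None` where both reflectances are zero is a choice made here to avoid division by zero, not something specified in the lecture.

```python
def ndvi(nir, red):
    """NDVI for one cell: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return None  # undefined when both reflectances are zero
    return (nir - red) / total

def ndvi_raster(nir_band, red_band):
    """Apply the NDVI formula cell by cell to two equally sized grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical 2x2 reflectance grids (0-1 scale).
band4_nir = [[0.50, 0.30],
             [0.02, 0.00]]
band3_red = [[0.10, 0.30],
             [0.06, 0.00]]

result = ndvi_raster(band4_nir, band3_red)
for row in result:
    print(row)
```

In this toy example, the top-left cell (high near-infrared, low red) gets a strongly positive NDVI, the top-right cell (equal reflectance in both bands) gets zero, and the bottom-left cell (more red than near-infrared) gets a negative value.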

Satellite imagery and “big data”

  • In the hands-on exercise/lab, you will get some practice working with an NDVI raster; hopefully, you now have a sense of how the NDVI raster you’ll work with was created using satellite imagery.
  • You’ll also get some practice working with a land-use raster, which was likewise generated from satellite imagery, using differences in spectral signatures to classify pixels into land-use categories.
  • While it is possible to work with satellite imagery, and the raster datasets derived from it, in R, you may encounter problems stemming from the extremely large size of these datasets.
  • If you need to process, visualize, calculate, or model with extremely large satellite and satellite-derived datasets, you may quickly find yourself running out of memory and processing power on your local machine.
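
A quick back-of-the-envelope calculation shows why size becomes a problem. The uncompressed, in-memory size of a raster is roughly rows × columns × bands × bytes per value. The dimensions below are hypothetical and are not taken from any specific dataset.

```python
def raster_bytes(rows, cols, bands, bytes_per_value):
    """Approximate uncompressed size of a raster, in bytes."""
    return rows * cols * bands * bytes_per_value

# Hypothetical raster: 40,000 x 40,000 cells, 7 bands, 4-byte (32-bit) values.
size = raster_bytes(40_000, 40_000, 7, 4)
size_gb = size / 1e9  # decimal gigabytes

print(f"{size_gb:.1f} GB")
```

A single raster of this hypothetical size would need about 44.8 GB if loaded fully into memory, which is more than most laptops have, and that is before any intermediate copies created during analysis.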

Satellite imagery and “big data”, continued

  • In recent years, “big data” in the realm of remote sensing has rapidly transformed from obstacle to opportunity, with the emergence of high-performance computing and cloud data infrastructures that make it possible to process and analyze data at the petabyte scale.
    • Putting “big data” in perspective: Until very recently, only about 4% of the total Landsat data archive had actually been analyzed, given the lack of computational resources
  • If you find yourself needing these computational resources, one place to seek assistance is the Research Computing department at CU
  • However, another option is to use Google Earth Engine, a “planetary-scale platform for Earth science data and analysis” developed by Google
  • Earth Engine is free for academic and non-profit use (though it requires registration), and effectively gives you the ability to tap into Google’s supercomputing resources directly from your web browser.

The uses of Google Earth Engine

  • Data archive
  • Useful when dealing with very large datasets, or when you want to implement complex algorithms that require large amounts of computational power
  • Visualization capabilities are less well-developed

Remote Sensing: Social Science Applications

As you might imagine, remote sensing is an essential tool for earth scientists and other natural scientists interested in biophysical processes. While it hasn’t traditionally been a commonly used method in social science, that is quickly changing:

  • Substantive research questions about the interaction between biosphere and society
  • Research design (biophysical variables as possible exogenous sources of variation); see Donaldson and Storeygard (2016) for examples
  • Providing information where official statistics are not reliable (e.g. North Korea)

Tutorial

  • Get some practice extracting and working with datasets derived from (MODIS) satellite imagery in RStudio
    • One section on NDVI
    • One section on land cover
  • Exploring Google Earth Engine